Oncogenic role and potential regulatory mechanism of topoisomerase IIα in a pan-cancer analysis

Topoisomerase IIα (TOP2A) plays an oncogenic role in multiple tumor types. However, no pan-cancer analysis of the function and upstream molecular mechanism of TOP2A is available. For the first time, we analyzed the potential oncogenic roles of TOP2A in 33 cancer types using The Cancer Genome Atlas (TCGA) database. TOP2A was overexpressed in almost all cancer types, and its overexpression was related to poor prognosis and advanced pathological stages in most cases. In addition, a high frequency of TOP2A genetic alterations was observed in several cancer types and was related to prognosis in some cases. Moreover, we predicted upstream miRNAs and lncRNAs of TOP2A to establish ceRNA networks in kidney renal clear cell carcinoma (SNHG3-miR-139-5p), kidney renal papillary cell carcinoma (TMEM147-AS1/N4BP2L2-IT2/THUMPD3-AS1/ERICD/TTN-AS1/SH3BP5-AS1/THRB-IT1/SNHG3/NEAT1-miR-139-5p), liver hepatocellular carcinoma (SNHG3/THUMPD3-AS1/NUTM2B-AS1/NUTM2A-AS1-miR-139-5p and SNHG6/GSEC/SNHG1/SNHG14/LINC00265/MIR3142HG-miR-101-3p) and lung adenocarcinoma (TYMSOS/HELLPAR/SNHG1/GSEC/SNHG6-miR-101-3p). TOP2A expression was generally positively correlated with cancer associated fibroblasts and with M0 and M1 macrophages in most cancer types. Furthermore, TOP2A was positively associated with the expression of immune checkpoints (CD274, CTLA4, HAVCR2, LAG3, PDCD1 and TIGIT) in most cancer types. Our first TOP2A pan-cancer study contributes to understanding the prognostic roles, immunological roles and potential upstream molecular mechanism of TOP2A in different cancers.

factor in tumorigenesis and tumor development by regulating various pathways, including the AKT and ERK pathways in colon cancer, the β-catenin pathway in pancreatic cancer and the MAPK pathway in lung adenocarcinoma (LUAD) 7,9,10 . In addition, TOP2A has been reported to participate in the regulation of tumor progression by interacting with other genes such as MDM4 and CENPF 11,12 .
The tumor microenvironment (TMA), which includes immune cells and tumor stromal cells, also influences tumor occurrence and progression. Effector T cells, dendritic cells (DCs), M1 macrophages and natural killer cells function as antineoplastic factors, while cancer associated fibroblasts (CAFs) and M2 macrophages act as tumor-promoting factors in tumor proliferation, invasion and metastasis 13,14 . In addition, immune suppressor cells, including regulatory T cells (Tregs), disrupt immune surveillance by inhibiting the proliferation of B and T cells and by disrupting antigen presentation by DCs 13 . Meanwhile, immune escape via immune checkpoints is also viewed as a vital mechanism in tumor occurrence and progression. Immunotherapy targeting immune checkpoints is a focus of current clinical research and is considered one of the effective means of antitumor therapy. Programmed cell death ligand 1 (PD-L1/CD274), cytotoxic T lymphocyte-associated antigen 4 (CTLA-4) and programmed cell death 1 (PD-1/PDCD1) are three frequently targeted immune checkpoints, but patients' response rates are still limited. Hence, more targets need to be explored to expand the therapeutic range and improve existing response rates. Three further immune checkpoints, namely T cell immunoglobulin domain and mucin domain 3 (TIM-3/HAVCR2), lymphocyte activation gene 3 (LAG-3) and T cell immunoreceptor with Ig and ITIM domains (TIGIT), are currently in clinical trials. LAG-3 negatively regulates the activation of T cells and synergizes with PDCD1 in mediating T cell exhaustion 15 . Meanwhile, HAVCR2 is associated with T cell exhaustion, and TIGIT plays an important role in limiting T cell inflammation 16,17 .
The competing endogenous RNA (ceRNA) hypothesis proposes that lncRNAs, mRNAs and transcribed pseudogenes compete for binding to miRNAs, acting as natural miRNA sponges by virtue of sharing at least one miRNA response element 18 . ceRNA networks targeting TOP2A have significant effects on tumor occurrence and development in several tumor types. For instance, TOP2A has been reported to be modulated by zinc finger protein 148 via miR-101, miR-144, miR-335 and miR-365 in the proliferation of colorectal cancer cells 19 . In gastric cancer, FAM230B has been confirmed to increase TOP2A expression via miR-27a-5p during tumor development and metastasis 20 . Although numerous studies have focused on the oncogenic role and regulatory networks of TOP2A in tumors, we found no comprehensive, systematic pan-cancer analysis of TOP2A. Meanwhile, more ceRNA regulatory networks need to be explored for an in-depth understanding of the molecular mechanism of TOP2A in multiple tumor types. To supplement the existing molecular mechanisms, we predicted the upstream miRNAs of TOP2A and their corresponding lncRNAs, and investigated the underlying ceRNA networks. Considering the important role of the TMA in tumors, we explored the potential relationship between TOP2A and immune infiltration. Moreover, we investigated the correlation between TOP2A and the six immune checkpoints above to provide more potential options for tumor therapy.
Taken together, we conducted, for the first time, a comprehensive pan-cancer analysis of TOP2A based on the clinical data of The Cancer Genome Atlas (TCGA) database, covering gene expression, clinical survival prognosis, pathological stage, genetic alteration, immune cell infiltration and immune checkpoints. Moreover, we predicted upstream miRNAs and lncRNAs of TOP2A to establish ceRNA networks in kidney renal clear cell carcinoma (KIRC), kidney renal papillary cell carcinoma (KIRP), liver hepatocellular carcinoma (LIHC) and LUAD by expression analysis, survival analysis and correlation analysis. We believe that our work could help refine the existing molecular and immunological mechanisms of TOP2A, explore the prognostic value of TOP2A and provide more possibilities for antitumor therapy.

Methods
Data source and processing. The Cancer Genome Atlas (TCGA; http://cancergenome.nih.gov) is a landmark cancer genomics program that had characterized over 20,000 primary cancer and normal samples across 33 cancer types as of October 2021. We collected gene expression RNAseq data and survival data for multiple cancer and normal samples of the TCGA database through UCSC Xena (https://xenabrowser.net/) 21 . Genotype-Tissue Expression (GTEx; http://commonfund.nih.gov/GTEx/) provides RNA sequencing gene expression data from 54 normal tissue sites across nearly 1000 people. To compare TOP2A expression between cancer and normal tissues, we used normal samples from GTEx for cancer types in which TCGA contained too few (less than 5) normal samples.
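The control-selection rule above can be written down as a small decision function. The following is a minimal sketch only: the function names and the example counts are hypothetical, and the threshold follows the "less than 5" wording of this paragraph, so that exactly 5 TCGA normal samples is assumed to count as enough.

```python
# Minimal sketch of the normal-tissue source rule described in the text.
# Assumption: "less than 5" TCGA normal samples means "not enough",
# so 5 or more TCGA normals are treated as sufficient.
MIN_TCGA_NORMALS = 5


def choose_normal_source(tcga_normal_count: int) -> str:
    """Return which database supplies the normal controls for one cancer type."""
    return "TCGA" if tcga_normal_count >= MIN_TCGA_NORMALS else "GTEx"


def plan_controls(normal_counts: dict) -> dict:
    """Map each cancer type to its control source, given TCGA normal counts."""
    return {cancer: choose_normal_source(n) for cancer, n in normal_counts.items()}


# Hypothetical counts, for illustration only (not taken from TCGA).
example_counts = {"TYPE_A": 0, "TYPE_B": 4, "TYPE_C": 72}
plan = plan_controls(example_counts)
```

With these illustrative counts, only TYPE_C would be compared against TCGA normal tissues; the other two would fall back to GTEx.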
TOP2A expression profiles. Perl scripts and the R language were used to organize the data. R was used to analyze TOP2A expression in cancer types with enough (more than 5) normal tissues from the TCGA project. We took |log2FC| > 1 and an adjusted p-value < 0.05 as the cut-off criteria for further TOP2A-ceRNA network analysis. For tumor types without enough normal tissues from the TCGA project, we used the "Expression on Box Plots" module of the GEPIA2 (Gene Expression Profiling Interactive Analysis) web server (http://gepia2.cancer-pku.cn) to compare TOP2A expression between these tumor tissues and the normal tissues of the GTEx and TCGA databases 22 .
Survival analysis. We analyzed overall survival (OS) and disease-free survival (DFS) according to TOP2A expression (50% high-expression group and 50% low-expression group) in different cancer types with the "Survival Plots" module of the GEPIA2 web server. GEPIA uses the Log-rank test, also known as the Mantel-Cox test, for hypothesis testing. Furthermore, we performed OS analysis with the "survival" package in R, using the Kaplan-Meier survival curve and the Log-rank test based on TOP2A expression (50% high-expression group and 50% low-expression group) in cancer types with enough (more than 5) TCGA adjacent normal tissues, and only the tumor types
Prediction of upstream miRNA/lncRNA. The Starbase database (https://starbase.sysu.edu.cn/) was employed to predict miRNA-lncRNA and miRNA-mRNA interactions 24 , and the results were supported by Ago CLIP-seq data. miRNA-target interactions were required to be predicted by at least 2 of the following programs: PITA, RNA22, miRmap, DIANA-microT, miRanda, PicTar and TargetScan. miRNA-lncRNA interactions were predicted with the miRanda program.
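To illustrate the median split and Log-rank comparison described under Survival analysis, the sketch below implements a two-group Log-rank (Mantel-Cox) test in plain Python. It is not the code used in this study, which relied on GEPIA2 and the R "survival" package; the function names and the toy data are assumptions made for illustration, and samples equal to the median are assigned to the low-expression group by convention.

```python
import math
from statistics import median


def median_split(expression):
    """Label each sample 1 (high expression) or 0 (low) relative to the median."""
    cut = median(expression)
    return [1 if x > cut else 0 for x in expression]


def logrank_test(times, events, groups):
    """Two-group Log-rank test.

    times  : follow-up time per sample
    events : 1 if the event (e.g. death) was observed, 0 if censored
    groups : 1 for the high-expression group, 0 for the low-expression group
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Samples still at risk just before time t.
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, g in zip(times, groups) if ti >= t and g == 1)
        # Events occurring exactly at time t, overall and in group 1.
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, groups)
                 if ti == t and e and g == 1)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            # Hypergeometric variance of the group-1 event count at time t.
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data for illustration only: higher expression, earlier events.
toy_expression = [9.1, 8.7, 8.5, 8.2, 3.0, 2.8, 2.5, 2.1]
toy_times = [1, 2, 3, 4, 10, 11, 12, 13]
toy_events = [1, 1, 1, 1, 1, 1, 1, 1]
toy_groups = median_split(toy_expression)
toy_chi2, toy_p = logrank_test(toy_times, toy_events, toy_groups)
```

In this toy example the high-expression group experiences all of its events first, so the test rejects equal survival at the 0.05 level. For real analyses, an established implementation such as the R "survival" package named above is preferable.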
Immune infiltration. The TIMER2.0 (http://timer.cistrome.org/) web server is a comprehensive resource for the systematic analysis of immune infiltrates across diverse cancer types 25,26 . We used it to explore the associations between TOP2A expression and immune infiltrates (B cells, CD4+ T cells, CD8+ T cells, cancer associated fibroblasts, myeloid dendritic cells, macrophages, monocytes, NK cells, Tregs and neutrophils) across TCGA tumors. Moreover, we used the "gene correlation" module of TIMER2.0 to explore the correlations between TOP2A and the expression of immune checkpoints (CD274, CTLA4, HAVCR2, LAG3, PDCD1 and TIGIT). The p-values and partial rho values were obtained via the purity-adjusted Spearman's rank correlation test.
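A purity-adjusted Spearman correlation can be understood as a partial rank correlation in which tumor purity is the controlled variable. The sketch below shows one standard way to compute it (rank-transform all three variables, then apply the first-order partial correlation formula). It is an assumption that this matches the general idea behind the TIMER2.0 output; the internal implementation of TIMER2.0 is not described here, and all function names and toy values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def partial_spearman(x, y, z):
    """Spearman correlation of x and y after adjusting for z (e.g. purity)."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    denom = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denom == 0:
        raise ValueError("x or y is perfectly rank-correlated with z")
    return (r_xy - r_xz * r_yz) / denom


# Toy values for illustration only: gene expression, infiltration score, purity.
toy_gene = [2, 1, 4, 3, 6, 5]
toy_infiltrate = [1, 3, 2, 5, 4, 6]
toy_purity = [1, 2, 3, 4, 5, 6]
raw_rho = spearman(toy_gene, toy_infiltrate)
adjusted_rho = partial_spearman(toy_gene, toy_infiltrate, toy_purity)
```

In the toy data both variables track purity, so the adjusted coefficient is lower than the raw one, which is the kind of confounding that the purity adjustment is meant to remove.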
Correlation analysis. The correlations between TOP2A expression and TOP2A-bound miRNAs, between TOP2A-bound miRNAs and upstream lncRNAs, and between TOP2A expression and upstream lncRNAs were analyzed in R. The p-values and partial rho values were obtained by the Spearman's rank correlation test.

Results
TOP2A was over-expressed in most cancers. We first analyzed TOP2A expression in cancer types with enough adjacent normal tissues in TCGA project. As graphed in Fig However, the expression of TOP2A was lower in acute myeloid leukemia (LAML) compared with normal tissues (p < 0.001), and we did not obtain a significant difference in testicular germ cell tumors (TGCT).
Genetic alteration analysis of TOP2A. We further investigated TOP2A genetic alterations in various tumor tissues derived from TCGA datasets through the online database cBioPortal (Fig. 4A). Genetic alterations in TOP2A were dominated by amplification and mutation, with the dominant type differing among cancer types. However, deep deletion was the primary type in ACC and UVM. A high frequency of TOP2A alterations was observed in ESCA (10.99%, mainly amplification), STAD (10.91%, mainly amplification), UCEC (8.88%, mainly mutation), SKCM (8.33%, mainly mutation), ACC ( www.nature.com/scientificreports/ mainly amplification) and OV (5.14%, mainly mutation). Figure 4B shows the types, sites and case numbers of TOP2A genetic mutations across different cancer types. The T215P missense mutation was the most common and was detected in 7 OV cases. The R1435Gfs*13 truncating mutation was found in 5 STAD cases and 1 UCEC case. The S1483L missense mutation was detected in 3 UCRC cases, 1 BLCA case, 1 GBM case and 1 LGG case. Subsequently, we explored the potential association between genetic alteration of TOP2A and clinical prognosis in different cancer types. UCEC patients with TOP2A alterations had better OS (p = 0.028), progression-free survival (PFS, p = 0.026) and disease-specific survival (p = 0.048) than those with unaltered TOP2A, whereas DFS did not differ significantly (p = 0.190, Fig. 4C). In contrast, ACC patients with TOP2A alterations had poorer PFS (p = 0.042) and disease-specific survival (p = 0.029) than those with unaltered TOP2A, and the difference in OS approached statistical significance (p = 0.057, Fig. 4D). In OV patients, we observed similarly poorer OS (p = 0.007), PFS (p = 0.002) and disease-specific survival (p = 0.003) in patients with altered TOP2A than in those with unaltered TOP2A (Fig. 4E).
Predictive analysis of upstream miRNA of TOP2A. We further constructed predictive ceRNA networks in cancer types with enough adjacent normal tissues in the TCGA project. We chose the cancer types in which TOP2A was associated with poor OS by both the Log-rank test and the Kaplan-Meier survival curve (Table 1). The starBase database was used for predicting upstream miRNAs of TOP2A, and these 38
correlated with poor prognosis (Fig. 5L). MicroRNA-139-5p showed a tendency that did not reach statistical significance (p = 0.091, Fig. 5L).
Predictive analysis of upstream lncRNAs of TOP2A-bound miRNAs. The starBase database was used to predict upstream lncRNAs of the above-mentioned miRNAs with expression and survival significance in KIRC, KIRP, LIHC and LUAD. According to the ceRNA mechanism, the upstream lncRNAs should be negatively correlated with their target miRNAs. Moreover, the expression of the lncRNAs should be higher in tumor tissues than in normal tissues. In KIRC, Small Nucleolar RNA Host Gene 3 (SNHG3) was significantly negatively correlated with miR-139-5p (R = −0.24, p < 0.001) and positively correlated with TOP2A (R = 0.19, p < 0.001, Fig. 6A). In addition, the expression of SNHG3 was higher in KIRC tissues than in normal tissues (p < 0.001, Fig. 6B), and KIRC patients with high SNHG3 expression had poor prognosis (p < 0.001, Fig. 6C).
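The selection criteria stated above (negative correlation with the miRNA, positive correlation with TOP2A, higher expression in tumor than in normal tissue) can be expressed as a simple filter. The sketch below is illustrative only: the class and function names are hypothetical, the 0.05 significance threshold is an assumption, the SNHG3 record uses the R values reported above with 0.001 standing in as an upper bound for the reported p < 0.001, and the second record is an invented counterexample.

```python
from dataclasses import dataclass

ALPHA = 0.05  # assumed significance threshold for the correlation tests


@dataclass
class LncRNACandidate:
    name: str
    r_mirna: float         # correlation of the lncRNA with the miRNA
    p_mirna: float
    r_target: float        # correlation of the lncRNA with the target mRNA (TOP2A)
    p_target: float
    higher_in_tumor: bool  # significantly higher in tumor than in normal tissue


def is_cerna_candidate(c: LncRNACandidate, alpha: float = ALPHA) -> bool:
    """Apply the ceRNA criteria: sponge the miRNA, track the target, be upregulated."""
    return (
        c.r_mirna < 0 and c.p_mirna < alpha
        and c.r_target > 0 and c.p_target < alpha
        and c.higher_in_tumor
    )


def filter_candidates(candidates, alpha: float = ALPHA):
    """Return the names of the lncRNAs that satisfy all ceRNA criteria."""
    return [c.name for c in candidates if is_cerna_candidate(c, alpha)]


candidates = [
    # R values as reported for SNHG3 in KIRC; p-values are stand-in upper bounds.
    LncRNACandidate("SNHG3", -0.24, 0.001, 0.19, 0.001, True),
    # Hypothetical lncRNA that fails because it is positively correlated with the miRNA.
    LncRNACandidate("HYPOTHETICAL_LNC", 0.30, 0.001, 0.25, 0.001, True),
]
kept = filter_candidates(candidates)
```

Only SNHG3 passes in this example. A survival criterion (poor prognosis with high lncRNA expression) could be added as a further boolean field in the same way.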

Discussion
As one of the important factors of DNA unlinking, TOP2A has been implicated in the occurrence and development of many different cancers, including bladder urothelial carcinoma, prostate cancer, breast cancer, colon cancer and liver cancer. However, it remains unclear whether TOP2A plays a role in the pathogenesis of different cancers through common molecular mechanisms. Therefore, we performed a pan-cancer analysis of TOP2A across 33 different cancer types. Our analysis corroborated that TOP2A was overexpressed in all 18 cancer types with enough normal tissues from the TCGA project. TOP2A was also overexpressed in most cancers when we added the normal tissues from the GTEx program. TOP2A was decreased only in LAML. However, we found that the chronic myelogenous leukemia K562 cell line was used as the matched normal tissue from GTEx in the comparison with LAML cells. LAML cells derive from immature hematopoietic cells in bone marrow (myeloblasts, promyelocytes, etc.), so immature hematopoietic cells from healthy persons would be the reasonable controls. Therefore, the LAML control tissues from GTEx have limitations. In addition, we provide substantial evidence in support of the prognostic value of TOP2A. The Log-rank test revealed that high expression of TOP2A was a risk factor in ACC, KIRC, KIRP, LGG, LIHC, LUAD, MESO, PAAD and SKCM, and a protective factor in COAD and THYM. Some of these findings are consistent with other studies 8,27,28,29 , but most of those results were derived from the analysis of TCGA or GEO transcriptome data. However, some studies found that higher expression of TOP2A is not associated with worse prognosis in these cancer types. Roca et al. showed that high expression of TOP2A had no prognostic value for OS in ACC patients, but was associated with longer time to progression (TTP) after the EDP-M scheme 30 .
Therefore, in addition to the transcriptome data based on TCGA or GEO databases, more experiments are needed to verify the prognostic value of TOP2A in tumors.
Commonly altered at both the copy number and expression levels in tumor cells, TOP2A is considered a key player in the decatenation checkpoint. Alterations of TOP2A may cause a defective decatenation checkpoint and then lead to chromosome instability and additional chromosomal imbalances in tumor cells, which results in increased cell survival, proliferation and carcinogenesis as well as greater tumor aggressiveness 2 . We analyzed the genetic alterations of TOP2A and discovered that TOP2A alterations in tumors were dominated by mutation and amplification. In terms of mutation, tumors with the p.K743N mutation of TOP2A have been reported to carry potentially oncogenic indel mutations, including frameshift mutations in the tumor suppressors PTEN and TP53 and an activating insertion in BRAF 31 . In our study, the most common mutation was the T215P missense mutation, but no report on this mutation exists to date, so it is a direction that we can explore in the future. Moreover, in most cases, gene amplification, that is, an increase in gene copy number, can lead to gene overexpression, which refers to an increase in post-transcriptional RNA. In this study, we did demonstrate the overexpression of TOP2A in most tumor types. However, the relationship between TOP2A amplification and expression is highly complex and needs to be further investigated, because protein overexpression is reportedly not always a result of TOP2A amplification 32 .
In this study, we first presented evidence of potential ceRNA networks based on TOP2A in tumors. Importantly, miRNA-139-5p was identified as a common potential upstream regulator of TOP2A in KIRC, KIRP, LIHC and LUAD. To our knowledge, only one previous report has addressed the relationship between miRNA-139-5p and TOP2A in tumors 33 . Although we did not perform experimental verification, we established the relationship by expression analysis, survival analysis and correlation analysis based on RNAseq data. Therefore, our finding is meaningful for understanding the common regulatory mechanism of TOP2A. Moreover, we found the SNHG3-miR-139-5p-TOP2A network in KIRC, KIRP and LIHC. The lncRNA SNHG3 has been verified as an oncogenic factor in tumorigenesis and tumor development that regulates miR-139-5p in various ways, involving the Notch pathway in ovarian cancer 34 , the MYB transcription factor in gastric cancer 35 and the BMI1 protein in LIHC 36 . Importantly, Zhang et al. verified the same SNHG3-miR-139-5p-TOP2A ceRNA network by functional experiments in KIRC 33 , which supports the reliability of our prediction based on transcriptome data. In addition, we found the SNHG1/SNHG6/GSEC-miR-101-3p-TOP2A network in LIHC and LUAD. No report exists on the regulatory relationship between miRNA-101-3p and TOP2A in tumors, but the regulatory function of SNHG1 on miR-101-3p has been verified in osteosarcoma 37 . However, prediction through the programs in the Starbase database has several limitations. All of those prediction programs are based on existing knowledge about the expression and functions of non-coding RNAs (ncRNAs). Currently, numerous ncRNAs remain unknown, so existing knowledge about ncRNAs needs to be continuously improved through the exploration of potential ncRNAs. Moreover, although several ncRNAs have been explored, their functions and corresponding regulatory mechanisms may differ with changes in expression pattern, structure and interacting proteins 38 .
As the largest class of ncRNAs, lncRNAs are considered to have highly diverse functions and regulatory mechanisms, which increases their complexity 39 . Additionally, the low abundance of most lncRNAs in cells and their tissue specificity make it more difficult to investigate the interactions between lncRNAs and proteins or nucleic acids 39,40 . To date, only a small subset of lncRNAs has been identified and functionally described in the literature 39,41 . Therefore, the regulatory networks we discovered based on the existing Ago CLIP-seq data require further confirmation. The TMA has been confirmed to have a significant effect on tumor progression and to influence patients' outcomes as well as chemotherapy drug resistance 42,43 . Thus, in-depth exploration of the potential relationship between TOP2A expression and immune infiltration is needed to supplement the immunological mechanism and to improve prognosis and therapeutic efficiency. In this study, TOP2A expression was negatively correlated with CD8+ T-cell infiltration in UCEC. On the one hand, Treg infiltration was positively associated with TOP2A expression in UCEC in this study. Considering the role of Tregs in inhibiting T cell proliferation, we speculated that TOP2A expression may influence the process by which T cells are inhibited by Tregs. High expression of TOP2A may cause a high level of Treg infiltration, and the resulting inhibition of CD8+ T-cells by Tregs leads to a low level of CD8+ T-cell infiltration. On the other hand, the presence of CD8+ T-cells is often associated with a favorable tumor prognosis 44 . High TOP2A expression was statistically associated with poor prognosis in UCEC, so this result suggests the possibility that high TOP2A expression affects patients' outcomes via the inhibition of CD8+ T-cell infiltration.
We also discovered that TOP2A expression in STAD was negatively associated with the infiltration levels of B cells, CD8+ T-cells and DCs, which may be a factor in the development of STAD. As for neutrophils, whether they function as tumor-antagonizing or tumor-promoting factors depends on the tumor type and developmental stage 44 . We found a positive correlation between neutrophil infiltration and TOP2A expression in BLCA, BRCA, BRCA-Basal, COAD, KICH, KIRC, KIRP, LIHC, OV and PAAD, hinting at a tumor-promoting role of neutrophils in these tumor types. Meanwhile, CAF infiltration showed a statistically positive relationship with TOP2A expression in several tumor types, which points to a potential role of CAFs in tumor occurrence and promotion. In contrast, we observed a negative relationship between CAF infiltration and TOP2A expression in BRCA. It has been confirmed that CAFs stimulate Treg infiltration, but that high Treg infiltration arrests CAFs at the G2/M phase 45 . Combining the close interaction of Tregs with CAFs and the positive association between TOP2A expression and Treg infiltration in BRCA, we considered that high TOP2A expression may promote the arrest of CAFs by Tregs and thereby affect CAF growth in BRCA. That is, high expression of TOP2A in BRCA may cause a high level of Treg infiltration, which in turn arrests CAFs at the G2/M phase, impairs their growth and leads to a low level of CAF infiltration. Additionally, M1 macrophages are viewed as antineoplastic factors, and M2 macrophages are regarded as tumor-promoting factors. The negative relationship between TOP2A expression and the infiltration level of M1 macrophages in TGCT and THYM suggests that high TOP2A expression may inhibit the function of M1 macrophages and thereby promote tumor progression.
In addition, the positive association between the infiltration level of M2 macrophages and TOP2A expression in GBM hints at the possibility that high TOP2A expression stimulates the tumor-promoting function of M2 macrophages in GBM.
As for antitumor therapy, TOP2A is viewed as a vital target, but the clinical efficiency of therapy targeting TOP2A may be limited by resistant tumor cells 46 . Combination therapy has been recommended to maximize the clinical effects of drugs. Combination therapy targeting CTLA-4 and PD-1 has been reported to greatly increase median survival in many tumor types 47 . Additionally, the combination of TIGIT and PDCD1 blockade promotes CD8+ T-cell proliferation and function, and plays a significant role in improving overall survival in preclinical trials 48,49 . To investigate the possibility of synergy between therapy targeting TOP2A and inhibitors of immune checkpoints, our study analyzed the correlation between TOP2A and six immune checkpoints: CD274 (PD-L1), CTLA-4, PDCD1 (PD-1), HAVCR2 (TIM-3), LAG-3 and TIGIT. The positive association between TOP2A expression and the expression of these immune checkpoints hints that combination therapy targeting TOP2A and one of these immune checkpoints may have antitumor synergism and improve therapeutic efficacy. Thus, the interactions between TOP2A and immune checkpoints need to be further investigated to provide more therapeutic choices and improve response rates.
Taken together, our first pan-cancer analysis of TOP2A indicated its widespread overexpression in different cancer types. High expression of TOP2A was related to poor prognosis and advanced pathological stages in most cases, and TOP2A genetic alterations were present in some tumor types. Furthermore, we built the upstream regulatory networks of TOP2A in KIRC, KIRP, LIHC and LUAD. TOP2A expression was generally positively correlated with cancer associated fibroblasts, M0 and M1 macrophages, and immune checkpoints. Our pan-cancer analysis of TOP2A contributes to understanding the prognostic and immunological roles and the potential upstream molecular mechanism of TOP2A in different cancers.